Dataset statistics
| Number of variables | 27 |
|---|---|
| Number of observations | 899164 |
| Missing cells | 751259 |
| Missing cells (%) | 3.1% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 185.2 MiB |
| Average record size in memory | 216.0 B |
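The percentage and per-record figures in this table can be recomputed from the raw counts. The sketch below assumes that "Missing cells (%)" is the share of missing cells among all rows × columns cells and that 1 MiB is 2**20 bytes; neither definition is stated in the report itself.

```python
# Figures copied from the Dataset statistics table.
n_variables = 27
n_observations = 899_164
missing_cells = 751_259
total_size_mib = 185.2

# Share of missing cells among all (rows x columns) cells.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average bytes per record, assuming 1 MiB = 2**20 bytes.
avg_record_bytes = total_size_mib * 2**20 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.0f} B")
```

Both results round to the values shown in the table (3.1% and 216 B).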
Variable types
| Numeric | 12 |
|---|---|
| Categorical | 14 |
| DateTime | 1 |
Alerts
| Alert | Type |
|---|---|
| Name has a high cardinality: 779583 distinct values | High cardinality |
| City has a high cardinality: 32581 distinct values | High cardinality |
| State has a high cardinality: 51 distinct values | High cardinality |
| Bank has a high cardinality: 5802 distinct values | High cardinality |
| BankState has a high cardinality: 56 distinct values | High cardinality |
| ApprovalDate has a high cardinality: 9859 distinct values | High cardinality |
| ApprovalFY has a high cardinality: 52 distinct values | High cardinality |
| ChgOffDate has a high cardinality: 6448 distinct values | High cardinality |
| Term is highly correlated with GrAppv and 1 other fields | High correlation |
| CreateJob is highly correlated with RetainedJob | High correlation |
| RetainedJob is highly correlated with CreateJob | High correlation |
| DisbursementGross is highly correlated with GrAppv and 1 other fields | High correlation |
| GrAppv is highly correlated with Term and 2 other fields | High correlation |
| SBA_Appv is highly correlated with Term and 2 other fields | High correlation |
| Term is highly correlated with DisbursementGross and 2 other fields | High correlation |
| DisbursementGross is highly correlated with Term and 2 other fields | High correlation |
| GrAppv is highly correlated with Term and 2 other fields | High correlation |
| SBA_Appv is highly correlated with Term and 2 other fields | High correlation |
| LoanNr_ChkDgt is highly correlated with ApprovalFY | High correlation |
| State is highly correlated with Zip and 1 other fields | High correlation |
| Zip is highly correlated with State and 1 other fields | High correlation |
| BankState is highly correlated with State and 1 other fields | High correlation |
| NAICS is highly correlated with ApprovalFY and 1 other fields | High correlation |
| ApprovalFY is highly correlated with LoanNr_ChkDgt and 4 other fields | High correlation |
| Term is highly correlated with MIS_Status | High correlation |
| CreateJob is highly correlated with ApprovalFY and 1 other fields | High correlation |
| RetainedJob is highly correlated with CreateJob | High correlation |
| UrbanRural is highly correlated with NAICS and 2 other fields | High correlation |
| RevLineCr is highly correlated with ApprovalFY and 1 other fields | High correlation |
| DisbursementGross is highly correlated with GrAppv and 1 other fields | High correlation |
| MIS_Status is highly correlated with Term | High correlation |
| GrAppv is highly correlated with DisbursementGross and 1 other fields | High correlation |
| SBA_Appv is highly correlated with DisbursementGross and 1 other fields | High correlation |
| State is highly correlated with BankState | High correlation |
| UrbanRural is highly correlated with ApprovalFY | High correlation |
| BankState is highly correlated with State | High correlation |
| ApprovalFY is highly correlated with UrbanRural | High correlation |
| ChgOffDate has 736465 (81.9%) missing values | Missing |
| NoEmp is highly skewed (γ1 = 80.24824355) | Skewed |
| CreateJob is highly skewed (γ1 = 36.99135473) | Skewed |
| RetainedJob is highly skewed (γ1 = 36.85481184) | Skewed |
| LoanNr_ChkDgt has unique values | Unique |
| NAICS has 201948 (22.5%) zeros | Zeros |
| CreateJob has 629248 (70.0%) zeros | Zeros |
| RetainedJob has 440403 (49.0%) zeros | Zeros |
| FranchiseCode has 208835 (23.2%) zeros | Zeros |
| ChgOffPrinGr has 737152 (82.0%) zeros | Zeros |
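The "Skewed" alerts report the skewness coefficient γ1. As a rough illustration of why a column that is mostly zeros with a few very large values scores so high, here is a minimal sketch of the population skewness m3 / m2^1.5. The helper name and the toy data are hypothetical. pandas' `Series.skew` applies a sample bias correction on top of this formula, so results may differ slightly from the report's figures, although the correction is negligible at this sample size.

```python
def skewness(values):
    """Population skewness: third central moment over variance**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical data, not taken from the dataset.
symmetric = [1, 2, 3, 4, 5]
long_tail = [0] * 99 + [1000]  # mostly zeros plus one large value

print(skewness(symmetric))  # symmetric data has zero skewness
print(skewness(long_tail))  # a single large value produces strong positive skew
```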
Reproduction
| Analysis started | 2022-06-22 00:59:41.766961 |
|---|---|
| Analysis finished | 2022-06-22 01:02:22.207278 |
| Duration | 2 minutes and 40.44 seconds |
| Software version | pandas-profiling v3.2.0 |
| Download configuration | config.json |
| Distinct | 899164 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4772612311 |
| Minimum | 1000014003 |
|---|---|
| Maximum | 9996003010 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.9 MiB |
Quantile statistics
| Minimum | 1000014003 |
|---|---|
| 5-th percentile | 1348457210 |
| Q1 | 2589757508 |
| median | 4361439006 |
| Q3 | 6904626505 |
| 95-th percentile | 9164803856 |
| Maximum | 9996003010 |
| Range | 8995989007 |
| Interquartile range (IQR) | 4314868996 |
Descriptive statistics
| Standard deviation | 2538175037 |
|---|---|
| Coefficient of variation (CV) | 0.5318209132 |
| Kurtosis | -1.086498977 |
| Mean | 4772612311 |
| Median Absolute Deviation (MAD) | 2013400000 |
| Skewness | 0.364757102 |
| Sum | 4.291361176 × 10^15 |
| Variance | 6.442332521 × 10^18 |
| Monotonicity | Strictly increasing |
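Two of the descriptive statistics above are derived from the others: the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. The sketch below uses the rounded values printed in the table, so it agrees with the report only to within that rounding.

```python
# Rounded figures copied from the Descriptive statistics table.
std_dev = 2_538_175_037
mean = 4_772_612_311

cv = std_dev / mean        # coefficient of variation
variance = std_dev ** 2    # variance is the squared standard deviation

print(f"CV = {cv:.6f}")
print(f"Variance = {variance:.6e}")
```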
| Value | Count | Frequency (%) |
| 1000014003 | 1 | < 0.1% |
| 5944984007 | 1 | < 0.1% |
| 5944874009 | 1 | < 0.1% |
| 5944884001 | 1 | < 0.1% |
| 5944904005 | 1 | < 0.1% |
| 5944914008 | 1 | < 0.1% |
| 5944924000 | 1 | < 0.1% |
| 5944934003 | 1 | < 0.1% |
| 5944944006 | 1 | < 0.1% |
| 5944954009 | 1 | < 0.1% |
| Other values (899154) | 899154 |
| Value | Count | Frequency (%) |
| 1000014003 | 1 | |
| 1000024006 | 1 | |
| 1000034009 | 1 | |
| 1000044001 | 1 | |
| 1000054004 | 1 | |
| 1000084002 | 1 | |
| 1000093009 | 1 | |
| 1000094005 | 1 | |
| 1000104006 | 1 | |
| 1000124001 | 1 |
| Value | Count | Frequency (%) |
| 9996003010 | 1 | |
| 9995973006 | 1 | |
| 9995613003 | 1 | |
| 9995603000 | 1 | |
| 9995573004 | 1 | |
| 9995563001 | 1 | |
| 9995493004 | 1 | |
| 9995473009 | 1 | |
| 9995453003 | 1 | |
| 9995423005 | 1 |
| Distinct | 779583 |
|---|---|
| Distinct (%) | 86.7% |
| Missing | 14 |
| Missing (%) | < 0.1% |
| Memory size | 6.9 MiB |
| SUBWAY | 1269 |
|---|---|
| QUIZNO'S SUBS | 433 |
| COLD STONE CREAMERY | 366 |
| QUIZNO'S | 345 |
| DOMINO'S PIZZA | 329 |
| Other values (779578) |
Length
| Max length | 30 |
|---|---|
| Median length | 23 |
| Mean length | 21.77596285 |
| Min length | 1 |
Characters and Unicode
| Total characters | 19579857 |
|---|---|
| Distinct characters | 91 |
| Distinct categories | 12 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 706468 |
|---|---|
| Unique (%) | 78.6% |
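The percentages and the mean length reported for this variable can be checked against the counts. This sketch assumes the denominators exclude the 14 missing rows; with so few missing values, the rounded percentages are the same either way.

```python
# Figures copied from the tables for this variable.
n_rows = 899_164
missing = 14
distinct = 779_583
unique = 706_468
total_characters = 19_579_857

non_missing = n_rows - missing
distinct_pct = 100 * distinct / non_missing
unique_pct = 100 * unique / non_missing
mean_length = total_characters / non_missing  # average characters per non-missing value

print(f"Distinct: {distinct_pct:.1f}%")
print(f"Unique: {unique_pct:.1f}%")
print(f"Mean length: {mean_length:.6f}")
```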
Sample
| 1st row | ABC HOBBYCRAFT |
|---|---|
| 2nd row | LANDMARK BAR & GRILLE (THE) |
| 3rd row | WHITLOCK DDS, TODD M. |
| 4th row | BIG BUCKS PAWN & JEWELRY, LLC |
| 5th row | ANASTASIA CONFECTIONS, INC. |
Common Values
| Value | Count | Frequency (%) |
| SUBWAY | 1269 | 0.1% |
| QUIZNO'S SUBS | 433 | < 0.1% |
| COLD STONE CREAMERY | 366 | < 0.1% |
| QUIZNO'S | 345 | < 0.1% |
| DOMINO'S PIZZA | 329 | < 0.1% |
| DAIRY QUEEN | 328 | < 0.1% |
| THE UPS STORE | 323 | < 0.1% |
| DUNKIN DONUTS | 299 | < 0.1% |
| MATCO TOOLS | 288 | < 0.1% |
| MAIL BOXES ETC | 280 | < 0.1% |
| Other values (779573) | 894890 |
Words
| Value | Count | Frequency (%) |
| inc | 263379 | 8.4% |
| (blank) | 100280 | 3.2% |
| llc | 77826 | 2.5% |
| and | 28959 | 0.9% |
| the | 28389 | 0.9% |
| of | 23026 | 0.7% |
| dba | 20214 | 0.6% |
| co | 18216 | 0.6% |
| a | 18114 | 0.6% |
| services | 17318 | 0.6% |
| Other values (226643) | 2530176 |
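The table above lists lower-cased tokens such as `inc` and `llc`, which suggests the values are split into words before counting. The exact tokenizer used by the report is not shown here, so the following is only a sketch under the assumption that values are lower-cased, split on whitespace, and stripped of surrounding punctuation. The function name is hypothetical; the input strings are the five sample rows shown earlier for this variable.

```python
import string
from collections import Counter

def word_counts(values):
    """Count lower-cased whitespace tokens with surrounding punctuation removed."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            counts[token.strip(string.punctuation)] += 1
    return counts

sample_names = [
    "ABC HOBBYCRAFT",
    "LANDMARK BAR & GRILLE (THE)",
    "WHITLOCK DDS, TODD M.",
    "BIG BUCKS PAWN & JEWELRY, LLC",
    "ANASTASIA CONFECTIONS, INC.",
]

counts = word_counts(sample_names)
print(counts.most_common(5))
```

Under this assumption a standalone `&` is reduced to an empty token, which would explain a row in the word table with no visible value.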
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 2231639 | 11.4% |
| E | 1354056 | 6.9% |
| I | 1226719 | 6.3% |
| A | 1177821 | 6.0% |
| N | 1170319 | 6.0% |
| R | 1052562 | 5.4% |
| C | 1038114 | 5.3% |
| S | 1009495 | 5.2% |
| O | 933206 | 4.8% |
| T | 917437 | 4.7% |
| Other values (81) | 7468489 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 14311292 | |
| Lowercase Letter | 2249775 | 11.5% |
| Space Separator | 2231639 | 11.4% |
| Other Punctuation | 712203 | 3.6% |
| Decimal Number | 38461 | 0.2% |
| Dash Punctuation | 29147 | 0.1% |
| Open Punctuation | 3600 | < 0.1% |
| Close Punctuation | 2973 | < 0.1% |
| Math Symbol | 498 | < 0.1% |
| Currency Symbol | 198 | < 0.1% |
| Other values (2) | 71 | < 0.1% |
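The category names in this table correspond to Unicode general categories. A minimal sketch of how characters can be bucketed this way with Python's standard `unicodedata` module follows. The mapping dictionary and function name are illustrative assumptions, not the report's own implementation, and the input is one of the sample rows shown earlier.

```python
import unicodedata
from collections import Counter

# Two-letter Unicode general-category codes mapped to the names used in the report.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Nd": "Decimal Number",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Sm": "Math Symbol",
    "Sc": "Currency Symbol",
    "Sk": "Modifier Symbol",
    "Pc": "Connector Punctuation",
}

def category_counts(text):
    """Count the characters of `text` per Unicode general category."""
    counts = Counter()
    for ch in text:
        code = unicodedata.category(ch)
        counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

counts = category_counts("LANDMARK BAR & GRILLE (THE)")
for name, count in counts.most_common():
    print(name, count)
```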
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| E | 1354056 | 9.5% |
| I | 1226719 | 8.6% |
| A | 1177821 | 8.2% |
| N | 1170319 | 8.2% |
| R | 1052562 | 7.4% |
| C | 1038114 | 7.3% |
| S | 1009495 | 7.1% |
| O | 933206 | 6.5% |
| T | 917437 | 6.4% |
| L | 840208 | 5.9% |
| Other values (16) | 3591355 |
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 250402 | |
| n | 238175 | |
| a | 206694 | |
| r | 187739 | 8.3% |
| i | 180961 | 8.0% |
| o | 178702 | 7.9% |
| t | 151259 | 6.7% |
| s | 141102 | 6.3% |
| c | 123850 | 5.5% |
| l | 107780 | 4.8% |
| Other values (16) | 483111 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 273453 | |
| , | 244641 | |
| & | 104166 | 14.6% |
| ' | 73757 | 10.4% |
| / | 10119 | 1.4% |
| # | 3514 | 0.5% |
| " | 906 | 0.1% |
| ! | 473 | 0.1% |
| : | 411 | 0.1% |
| * | 244 | < 0.1% |
| Other values (5) | 519 | 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 7572 | |
| 2 | 6295 | |
| 0 | 4730 | |
| 3 | 3993 | |
| 4 | 3678 | |
| 5 | 2715 | 7.1% |
| 8 | 2585 | 6.7% |
| 6 | 2467 | 6.4% |
| 7 | 2234 | 5.8% |
| 9 | 2192 | 5.7% |
Math Symbol
| Value | Count | Frequency (%) |
| + | 468 | |
| = | 16 | 3.2% |
| > | 9 | 1.8% |
| < | 5 | 1.0% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 3597 | |
| [ | 3 | 0.1% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 2972 | |
| ] | 1 | < 0.1% |
Modifier Symbol
| Value | Count | Frequency (%) |
| ` | 64 | |
| ^ | 4 | 5.9% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 2231639 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 29147 |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 198 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 3 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 16561067 | |
| Common | 3018790 | 15.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| E | 1354056 | 8.2% |
| I | 1226719 | 7.4% |
| A | 1177821 | 7.1% |
| N | 1170319 | 7.1% |
| R | 1052562 | 6.4% |
| C | 1038114 | 6.3% |
| S | 1009495 | 6.1% |
| O | 933206 | 5.6% |
| T | 917437 | 5.5% |
| L | 840208 | 5.1% |
| Other values (42) | 5841130 |
Common
| Value | Count | Frequency (%) |
| (space) | 2231639 | |
| . | 273453 | 9.1% |
| , | 244641 | 8.1% |
| & | 104166 | 3.5% |
| ' | 73757 | 2.4% |
| - | 29147 | 1.0% |
| / | 10119 | 0.3% |
| 1 | 7572 | 0.3% |
| 2 | 6295 | 0.2% |
| 0 | 4730 | 0.2% |
| Other values (29) | 33271 | 1.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 19579857 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 2231639 | 11.4% |
| E | 1354056 | 6.9% |
| I | 1226719 | 6.3% |
| A | 1177821 | 6.0% |
| N | 1170319 | 6.0% |
| R | 1052562 | 5.4% |
| C | 1038114 | 5.3% |
| S | 1009495 | 5.2% |
| O | 933206 | 4.8% |
| T | 917437 | 4.7% |
| Other values (81) | 7468489 |
| Distinct | 32581 |
|---|---|
| Distinct (%) | 3.6% |
| Missing | 30 |
| Missing (%) | < 0.1% |
| Memory size | 6.9 MiB |
| LOS ANGELES | 11558 |
|---|---|
| HOUSTON | 10247 |
| NEW YORK | 7846 |
| CHICAGO | 6036 |
| MIAMI | 5594 |
| Other values (32576) |
Length
| Max length | 30 |
|---|---|
| Median length | 27 |
| Mean length | 9.103062502 |
| Min length | 1 |
Characters and Unicode
| Total characters | 8184873 |
|---|---|
| Distinct characters | 80 |
| Distinct categories | 9 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 12872 |
|---|---|
| Unique (%) | 1.4% |
Sample
| 1st row | EVANSVILLE |
|---|---|
| 2nd row | NEW PARIS |
| 3rd row | BLOOMINGTON |
| 4th row | BROKEN ARROW |
| 5th row | ORLANDO |
Common Values
| Value | Count | Frequency (%) |
| LOS ANGELES | 11558 | 1.3% |
| HOUSTON | 10247 | 1.1% |
| NEW YORK | 7846 | 0.9% |
| CHICAGO | 6036 | 0.7% |
| MIAMI | 5594 | 0.6% |
| SAN DIEGO | 5363 | 0.6% |
| DALLAS | 5085 | 0.6% |
| PHOENIX | 4493 | 0.5% |
| LAS VEGAS | 4390 | 0.5% |
| SPRINGFIELD | 3738 | 0.4% |
| Other values (32571) | 834784 |
Words
| Value | Count | Frequency (%) |
| city | 23831 | 2.0% |
| san | 21942 | 1.8% |
| new | 16075 | 1.3% |
| los | 13000 | 1.1% |
| angeles | 12380 | 1.0% |
| lake | 10729 | 0.9% |
| houston | 10587 | 0.9% |
| beach | 10462 | 0.9% |
| park | 10316 | 0.9% |
| york | 9724 | 0.8% |
| Other values (17695) | 1066583 |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 744405 | 9.1% |
| E | 723098 | 8.8% |
| O | 632510 | 7.7% |
| N | 621338 | 7.6% |
| L | 573578 | 7.0% |
| R | 513614 | 6.3% |
| S | 475392 | 5.8% |
| I | 468344 | 5.7% |
| T | 425108 | 5.2% |
| (space) | 306936 | 3.8% |
| Other values (70) | 2700550 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 7442897 | |
| Lowercase Letter | 398062 | 4.9% |
| Space Separator | 306936 | 3.8% |
| Open Punctuation | 14884 | 0.2% |
| Other Punctuation | 11120 | 0.1% |
| Close Punctuation | 9119 | 0.1% |
| Dash Punctuation | 946 | < 0.1% |
| Decimal Number | 870 | < 0.1% |
| Modifier Symbol | 39 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 744405 | 10.0% |
| E | 723098 | 9.7% |
| O | 632510 | 8.5% |
| N | 621338 | 8.3% |
| L | 573578 | 7.7% |
| R | 513614 | 6.9% |
| S | 475392 | 6.4% |
| I | 468344 | 6.3% |
| T | 425108 | 5.7% |
| C | 262549 | 3.5% |
| Other values (16) | 2002961 |
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 43411 | |
| a | 41550 | |
| n | 36545 | |
| o | 36384 | |
| l | 32699 | 8.2% |
| i | 30470 | 7.7% |
| r | 29637 | 7.4% |
| t | 24529 | 6.2% |
| s | 21884 | 5.5% |
| d | 12360 | 3.1% |
| Other values (16) | 88593 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 8672 | |
| , | 1215 | 10.9% |
| ' | 1134 | 10.2% |
| : | 29 | 0.3% |
| & | 22 | 0.2% |
| / | 21 | 0.2% |
| ; | 18 | 0.2% |
| # | 5 | < 0.1% |
| @ | 2 | < 0.1% |
| * | 1 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 153 | |
| 1 | 145 | |
| 2 | 113 | |
| 5 | 90 | |
| 4 | 86 | |
| 3 | 78 | |
| 6 | 63 | |
| 9 | 51 | 5.9% |
| 8 | 49 | 5.6% |
| 7 | 42 | 4.8% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 14879 | |
| [ | 5 | < 0.1% |
Modifier Symbol
| Value | Count | Frequency (%) |
| ` | 38 | |
| ^ | 1 | 2.6% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 306936 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 9119 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 946 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 7840959 | |
| Common | 343914 | 4.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| A | 744405 | 9.5% |
| E | 723098 | 9.2% |
| O | 632510 | 8.1% |
| N | 621338 | 7.9% |
| L | 573578 | 7.3% |
| R | 513614 | 6.6% |
| S | 475392 | 6.1% |
| I | 468344 | 6.0% |
| T | 425108 | 5.4% |
| C | 262549 | 3.3% |
| Other values (42) | 2401023 |
Common
| Value | Count | Frequency (%) |
| (space) | 306936 | |
| ( | 14879 | 4.3% |
| ) | 9119 | 2.7% |
| . | 8672 | 2.5% |
| , | 1215 | 0.4% |
| ' | 1134 | 0.3% |
| - | 946 | 0.3% |
| 0 | 153 | < 0.1% |
| 1 | 145 | < 0.1% |
| 2 | 113 | < 0.1% |
| Other values (18) | 602 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 8184873 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| A | 744405 | 9.1% |
| E | 723098 | 8.8% |
| O | 632510 | 7.7% |
| N | 621338 | 7.6% |
| L | 573578 | 7.0% |
| R | 513614 | 6.3% |
| S | 475392 | 5.8% |
| I | 468344 | 5.7% |
| T | 425108 | 5.2% |
| (space) | 306936 | 3.8% |
| Other values (70) | 2700550 |
| Distinct | 51 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 14 |
| Missing (%) | < 0.1% |
| Memory size | 6.9 MiB |
| CA | 130619 |
|---|---|
| TX | 70458 |
| NY | 57693 |
| FL | 41212 |
| PA | 35170 |
| Other values (46) |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Characters and Unicode
| Total characters | 1798300 |
|---|---|
| Distinct characters | 24 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | IN |
|---|---|
| 2nd row | IN |
| 3rd row | IN |
| 4th row | OK |
| 5th row | FL |
Common Values
| Value | Count | Frequency (%) |
| CA | 130619 | 14.5% |
| TX | 70458 | 7.8% |
| NY | 57693 | 6.4% |
| FL | 41212 | 4.6% |
| PA | 35170 | 3.9% |
| OH | 32622 | 3.6% |
| IL | 29669 | 3.3% |
| MA | 25272 | 2.8% |
| MN | 24373 | 2.7% |
| NJ | 24035 | 2.7% |
| Other values (41) | 428027 |
Words
| Value | Count | Frequency (%) |
| ca | 130619 | 14.5% |
| tx | 70458 | 7.8% |
| ny | 57693 | 6.4% |
| fl | 41212 | 4.6% |
| pa | 35170 | 3.9% |
| oh | 32622 | 3.6% |
| il | 29669 | 3.3% |
| ma | 25272 | 2.8% |
| mn | 24373 | 2.7% |
| nj | 24035 | 2.7% |
| Other values (41) | 428027 |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 306176 | |
| C | 184957 | |
| N | 181727 | |
| M | 132549 | 7.4% |
| T | 125069 | 7.0% |
| I | 119518 | 6.6% |
| O | 94906 | 5.3% |
| L | 88819 | 4.9% |
| X | 70458 | 3.9% |
| Y | 68255 | 3.8% |
| Other values (14) | 425866 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 1798300 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 306176 | |
| C | 184957 | |
| N | 181727 | |
| M | 132549 | 7.4% |
| T | 125069 | 7.0% |
| I | 119518 | 6.6% |
| O | 94906 | 5.3% |
| L | 88819 | 4.9% |
| X | 70458 | 3.9% |
| Y | 68255 | 3.8% |
| Other values (14) | 425866 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1798300 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| A | 306176 | |
| C | 184957 | |
| N | 181727 | |
| M | 132549 | 7.4% |
| T | 125069 | 7.0% |
| I | 119518 | 6.6% |
| O | 94906 | 5.3% |
| L | 88819 | 4.9% |
| X | 70458 | 3.9% |
| Y | 68255 | 3.8% |
| Other values (14) | 425866 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1798300 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| A | 306176 | |
| C | 184957 | |
| N | 181727 | |
| M | 132549 | 7.4% |
| T | 125069 | 7.0% |
| I | 119518 | 6.6% |
| O | 94906 | 5.3% |
| L | 88819 | 4.9% |
| X | 70458 | 3.9% |
| Y | 68255 | 3.8% |
| Other values (14) | 425866 |
| Distinct | 33611 |
|---|---|
| Distinct (%) | 3.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 53804.39124 |
| Minimum | 0 |
|---|---|
| Maximum | 99999 |
| Zeros | 283 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.9 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3838 |
| Q1 | 27587 |
| median | 55410 |
| Q3 | 83704 |
| 95-th percentile | 95822 |
| Maximum | 99999 |
| Range | 99999 |
| Interquartile range (IQR) | 56117 |
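The last two rows of the quantile table are differences of the rows above them: range is maximum minus minimum, and IQR is Q3 minus Q1. The sketch below checks this with the table's figures and then computes quartiles on a small hypothetical list. It uses `statistics.quantiles` with `method="inclusive"`, which interpolates linearly between sorted values and should therefore match the linear interpolation that pandas uses by default.

```python
from statistics import quantiles

# Figures copied from the quantile table above.
minimum, q1, q3, maximum = 0, 27_587, 83_704, 99_999
value_range = maximum - minimum
iqr = q3 - q1
print(value_range, iqr)

# Hypothetical data, used only to show how quartiles are obtained.
data = [1, 2, 3, 4, 5, 6, 7, 8]
low, mid, high = quantiles(data, n=4, method="inclusive")
toy_iqr = high - low
print(low, mid, high, toy_iqr)
```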
Descriptive statistics
| Standard deviation | 31184.15915 |
|---|---|
| Coefficient of variation (CV) | 0.5795839044 |
| Kurtosis | -1.335989332 |
| Mean | 53804.39124 |
| Median Absolute Deviation (MAD) | 28206 |
| Skewness | -0.1681666308 |
| Sum | 4.837897165 × 10^10 |
| Variance | 972451782 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 10001 | 933 | 0.1% |
| 90015 | 926 | 0.1% |
| 93401 | 806 | 0.1% |
| 90010 | 733 | 0.1% |
| 33166 | 671 | 0.1% |
| 90021 | 666 | 0.1% |
| 59601 | 640 | 0.1% |
| 65804 | 599 | 0.1% |
| 3801 | 581 | 0.1% |
| 59101 | 578 | 0.1% |
| Other values (33601) | 892031 |
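Because this variable is profiled as numeric, a value such as 3801 has presumably lost a leading zero. Assuming the values are five-digit US ZIP codes stored as integers (an assumption the report does not confirm), one way to restore a fixed-width string form is sketched below. The helper name is hypothetical.

```python
def format_zip(code: int) -> str:
    """Render an integer ZIP code as a five-character, zero-padded string."""
    return f"{code:05d}"

for code in (3801, 10001, 0):
    print(code, "->", format_zip(code))
```

Treating the column as a string would also keep the mean, variance, and skewness above from being read as meaningful quantities, since ZIP codes are labels rather than measurements.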
| Value | Count | Frequency (%) |
| 0 | 283 | |
| 1 | 24 | < 0.1% |
| 2 | 11 | < 0.1% |
| 3 | 5 | < 0.1% |
| 4 | 5 | < 0.1% |
| 5 | 5 | < 0.1% |
| 6 | 4 | < 0.1% |
| 7 | 6 | < 0.1% |
| 8 | 15 | < 0.1% |
| 9 | 24 | < 0.1% |
| Value | Count | Frequency (%) |
| 99999 | 209 | |
| 99950 | 3 | < 0.1% |
| 99929 | 15 | < 0.1% |
| 99928 | 1 | < 0.1% |
| 99926 | 1 | < 0.1% |
| 99925 | 4 | < 0.1% |
| 99923 | 1 | < 0.1% |
| 99921 | 13 | < 0.1% |
| 99919 | 2 | < 0.1% |
| 99918 | 1 | < 0.1% |
| Distinct | 5802 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 1559 |
| Missing (%) | 0.2% |
| Memory size | 6.9 MiB |
| BANK OF AMERICA NATL ASSOC | 86853 |
|---|---|
| WELLS FARGO BANK NATL ASSOC | 63503 |
| JPMORGAN CHASE BANK NATL ASSOC | 48167 |
| U.S. BANK NATIONAL ASSOCIATION | 35143 |
| CITIZENS BANK NATL ASSOC | 35054 |
| Other values (5797) |
Length
| Max length | 30 |
|---|---|
| Median length | 26 |
| Mean length | 23.1879457 |
| Min length | 3 |
Characters and Unicode
| Total characters | 20813616 |
|---|---|
| Distinct characters | 50 |
| Distinct categories | 8 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 923 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | FIFTH THIRD BANK |
|---|---|
| 2nd row | 1ST SOURCE BANK |
| 3rd row | GRANT COUNTY STATE BANK |
| 4th row | 1ST NATL BK & TR CO OF BROKEN |
| 5th row | FLORIDA BUS. DEVEL CORP |
Common Values
| Value | Count | Frequency (%) |
| BANK OF AMERICA NATL ASSOC | 86853 | 9.7% |
| WELLS FARGO BANK NATL ASSOC | 63503 | 7.1% |
| JPMORGAN CHASE BANK NATL ASSOC | 48167 | 5.4% |
| U.S. BANK NATIONAL ASSOCIATION | 35143 | 3.9% |
| CITIZENS BANK NATL ASSOC | 35054 | 3.9% |
| PNC BANK, NATIONAL ASSOCIATION | 27351 | 3.0% |
| BBCN BANK | 22978 | 2.6% |
| CAPITAL ONE NATL ASSOC | 22248 | 2.5% |
| MANUFACTURERS & TRADERS TR CO | 11265 | 1.3% |
| READYCAP LENDING, LLC | 10664 | 1.2% |
| Other values (5792) | 534379 |
Words
| Value | Count | Frequency (%) |
| bank | 651608 | |
| natl | 318240 | 9.0% |
| assoc | 306768 | 8.7% |
| of | 142852 | 4.1% |
| national | 125899 | 3.6% |
| america | 100686 | 2.9% |
| association | 84965 | 2.4% |
| fargo | 63732 | 1.8% |
| wells | 63650 | 1.8% |
| (blank) | 52264 | 1.5% |
| Other values (3602) | 1606709 |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 2762231 | |
| (space) | 2620014 | |
| N | 2105500 | |
| S | 1520499 | 7.3% |
| O | 1336993 | 6.4% |
| T | 1181841 | 5.7% |
| C | 1134642 | 5.5% |
| I | 1061717 | 5.1% |
| E | 923739 | 4.4% |
| L | 922583 | 4.4% |
| Other values (40) | 5243857 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 17830764 | |
| Space Separator | 2620014 | 12.6% |
| Other Punctuation | 341354 | 1.6% |
| Dash Punctuation | 10861 | 0.1% |
| Decimal Number | 9482 | < 0.1% |
| Open Punctuation | 584 | < 0.1% |
| Close Punctuation | 555 | < 0.1% |
| Math Symbol | 2 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 2762231 | |
| N | 2105500 | |
| S | 1520499 | 8.5% |
| O | 1336993 | 7.5% |
| T | 1181841 | 6.6% |
| C | 1134642 | 6.4% |
| I | 1061717 | 6.0% |
| E | 923739 | 5.2% |
| L | 922583 | 5.2% |
| B | 893994 | 5.0% |
| Other values (16) | 3987025 |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 5538 | |
| 5 | 1268 | 13.4% |
| 0 | 1258 | 13.3% |
| 4 | 1222 | 12.9% |
| 2 | 112 | 1.2% |
| 7 | 33 | 0.3% |
| 3 | 24 | 0.3% |
| 9 | 17 | 0.2% |
| 8 | 7 | 0.1% |
| 6 | 3 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 192998 | |
| , | 94677 | |
| & | 50021 | 14.7% |
| / | 1833 | 0.5% |
| ' | 1811 | 0.5% |
| : | 10 | < 0.1% |
| # | 2 | < 0.1% |
| * | 1 | < 0.1% |
| % | 1 | < 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 2620014 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 10861 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 584 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 555 |
Math Symbol
| Value | Count | Frequency (%) |
| + | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 17830764 | |
| Common | 2982852 | 14.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| A | 2762231 | |
| N | 2105500 | |
| S | 1520499 | 8.5% |
| O | 1336993 | 7.5% |
| T | 1181841 | 6.6% |
| C | 1134642 | 6.4% |
| I | 1061717 | 6.0% |
| E | 923739 | 5.2% |
| L | 922583 | 5.2% |
| B | 893994 | 5.0% |
| Other values (16) | 3987025 |
Common
| Value | Count | Frequency (%) |
| (space) | 2620014 | |
| . | 192998 | 6.5% |
| , | 94677 | 3.2% |
| & | 50021 | 1.7% |
| - | 10861 | 0.4% |
| 1 | 5538 | 0.2% |
| / | 1833 | 0.1% |
| ' | 1811 | 0.1% |
| 5 | 1268 | < 0.1% |
| 0 | 1258 | < 0.1% |
| Other values (14) | 2573 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 20813616 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| A | 2762231 | |
| (space) | 2620014 | |
| N | 2105500 | |
| S | 1520499 | 7.3% |
| O | 1336993 | 6.4% |
| T | 1181841 | 5.7% |
| C | 1134642 | 5.5% |
| I | 1061717 | 5.1% |
| E | 923739 | 4.4% |
| L | 922583 | 4.4% |
| Other values (40) | 5243857 |
| Distinct | 56 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 1566 |
| Missing (%) | 0.2% |
| Memory size | 6.9 MiB |
| CA | 118116 |
|---|---|
| NC | 79514 |
| IL | 65908 |
| OH | 58461 |
| SD | 51095 |
| Other values (51) |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Characters and Unicode
| Total characters | 1795196 |
|---|---|
| Distinct characters | 24 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 3 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | OH |
|---|---|
| 2nd row | IN |
| 3rd row | IN |
| 4th row | OK |
| 5th row | FL |
Common Values
| Value | Count | Frequency (%) |
| CA | 118116 | 13.1% |
| NC | 79514 | 8.8% |
| IL | 65908 | 7.3% |
| OH | 58461 | 6.5% |
| SD | 51095 | 5.7% |
| TX | 47790 | 5.3% |
| RI | 45366 | 5.0% |
| NY | 39592 | 4.4% |
| VA | 29002 | 3.2% |
| DE | 24537 | 2.7% |
| Other values (46) | 338217 |
Words
| Value | Count | Frequency (%) |
| ca | 118116 | 13.2% |
| nc | 79514 | 8.9% |
| il | 65908 | 7.3% |
| oh | 58461 | 6.5% |
| sd | 51095 | 5.7% |
| tx | 47790 | 5.3% |
| ri | 45366 | 5.1% |
| ny | 39592 | 4.4% |
| va | 29002 | 3.2% |
| de | 24537 | 2.7% |
| Other values (46) | 338217 |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 241398 | |
| C | 229604 | |
| N | 187751 | |
| I | 158854 | 8.8% |
| O | 102604 | 5.7% |
| L | 96914 | 5.4% |
| D | 96078 | 5.4% |
| T | 94941 | 5.3% |
| M | 85034 | 4.7% |
| S | 73385 | 4.1% |
| Other values (14) | 428633 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 1795196 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 241398 | |
| C | 229604 | |
| N | 187751 | |
| I | 158854 | 8.8% |
| O | 102604 | 5.7% |
| L | 96914 | 5.4% |
| D | 96078 | 5.4% |
| T | 94941 | 5.3% |
| M | 85034 | 4.7% |
| S | 73385 | 4.1% |
| Other values (14) | 428633 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1795196 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| A | 241398 | |
| C | 229604 | |
| N | 187751 | |
| I | 158854 | 8.8% |
| O | 102604 | 5.7% |
| L | 96914 | 5.4% |
| D | 96078 | 5.4% |
| T | 94941 | 5.3% |
| M | 85034 | 4.7% |
| S | 73385 | 4.1% |
| Other values (14) | 428633 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1795196 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| A | 241398 | |
| C | 229604 | |
| N | 187751 | |
| I | 158854 | 8.8% |
| O | 102604 | 5.7% |
| L | 96914 | 5.4% |
| D | 96078 | 5.4% |
| T | 94941 | 5.3% |
| M | 85034 | 4.7% |
| S | 73385 | 4.1% |
| Other values (14) | 428633 |
| Distinct | 1312 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 398660.9501 |
| Minimum | 0 |
|---|---|
| Maximum | 928120 |
| Zeros | 201948 |
| Zeros (%) | 22.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.9 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 235210 |
| median | 445310 |
| Q3 | 561730 |
| 95-th percentile | 811192 |
| Maximum | 928120 |
| Range | 928120 |
| Interquartile range (IQR) | 326520 |
Descriptive statistics
| Standard deviation | 263318.3128 |
|---|---|
| Coefficient of variation (CV) | 0.6605069111 |
| Kurtosis | -1.047652612 |
| Mean | 398660.9501 |
| Median Absolute Deviation (MAD) | 176300 |
| Skewness | -0.2628783414 |
| Sum | 3.584615746 × 10^11 |
| Variance | 6.933653383 × 10^10 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 201948 | 22.5% |
| 722110 | 27989 | 3.1% |
| 722211 | 19448 | 2.2% |
| 811111 | 14585 | 1.6% |
| 621210 | 14048 | 1.6% |
| 624410 | 10111 | 1.1% |
| 812112 | 9230 | 1.0% |
| 561730 | 8935 | 1.0% |
| 621310 | 8733 | 1.0% |
| 812320 | 7894 | 0.9% |
| Other values (1302) | 576243 |
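The most common value here is 0 (22.5% of rows), while every other listed value is a six-digit code. Assuming that 0 is a placeholder for an unknown industry and that the remaining values are six-digit NAICS codes whose first two digits identify the sector, a sketch of separating the two cases might look like this. The function name is hypothetical.

```python
from typing import Optional

def naics_sector(code: int) -> Optional[int]:
    """Return the two-digit sector of a six-digit code, or None for the 0 placeholder."""
    if code == 0:
        return None  # assumed to mean "industry not recorded"
    return code // 10_000  # keep the first two of six digits

for code in (722110, 561730, 0):
    print(code, "->", naics_sector(code))
```

Handling the zeros separately matters for the statistics above: with more than a fifth of the rows at 0, the mean and the 5-th percentile describe the placeholder as much as the codes.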
| Value | Count | Frequency (%) |
| 0 | 201948 | |
| 111110 | 32 | < 0.1% |
| 111120 | 3 | < 0.1% |
| 111130 | 1 | < 0.1% |
| 111140 | 94 | < 0.1% |
| 111150 | 49 | < 0.1% |
| 111160 | 2 | < 0.1% |
| 111191 | 3 | < 0.1% |
| 111199 | 7 | < 0.1% |
| 111211 | 16 | < 0.1% |
| Value | Count | Frequency (%) |
| 928120 | 32 | |
| 928110 | 4 | < 0.1% |
| 927110 | 1 | < 0.1% |
| 926150 | 10 | < 0.1% |
| 926140 | 6 | < 0.1% |
| 926130 | 3 | < 0.1% |
| 926120 | 5 | < 0.1% |
| 926110 | 6 | < 0.1% |
| 925120 | 1 | < 0.1% |
| 925110 | 3 | < 0.1% |
| Distinct | 9859 |
|---|---|
| Distinct (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.9 MiB |
| 7-Jul-93 | 1131 |
|---|---|
| 30-Jan-04 | 1032 |
| 8-Jul-93 | 780 |
| 4-Oct-04 | 658 |
| 30-Sep-03 | 608 |
| Other values (9854) |
Length
| Max length | 9 |
|---|---|
| Median length | 9 |
| Mean length | 8.721139859 |
| Min length | 8 |
Characters and Unicode
| Total characters | 7841735 |
|---|---|
| Distinct characters | 33 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 952 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | 28-Feb-97 |
|---|---|
| 2nd row | 28-Feb-97 |
| 3rd row | 28-Feb-97 |
| 4th row | 28-Feb-97 |
| 5th row | 28-Feb-97 |
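The sample values are day-month-year strings such as `28-Feb-97`, and the variable is profiled as text rather than as a date. A minimal sketch of parsing this format with the standard library follows; the function name is hypothetical. Two caveats apply: `%b` matches English month abbreviations only under an English or C locale, and `%y` maps 69 to 99 onto 1969 to 1999 and 00 to 68 onto 2000 to 2068, so any approval dates before 1969 would need separate handling.

```python
from datetime import date, datetime

def parse_approval_date(text: str) -> date:
    """Parse strings like '28-Feb-97' (day, abbreviated month, two-digit year)."""
    return datetime.strptime(text, "%d-%b-%y").date()

for raw in ("28-Feb-97", "30-Jan-04", "7-Jul-93"):
    print(raw, "->", parse_approval_date(raw).isoformat())
```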
Common Values
| Value | Count | Frequency (%) |
| 7-Jul-93 | 1131 | 0.1% |
| 30-Jan-04 | 1032 | 0.1% |
| 8-Jul-93 | 780 | 0.1% |
| 4-Oct-04 | 658 | 0.1% |
| 30-Sep-03 | 608 | 0.1% |
| 30-Jun-05 | 572 | 0.1% |
| 18-Apr-05 | 534 | 0.1% |
| 6-Jul-93 | 523 | 0.1% |
| 21-Jan-05 | 498 | 0.1% |
| 27-Sep-02 | 497 | 0.1% |
| Other values (9849) | 892331 |
Words
| Value | Count | Frequency (%) |
| 7-jul-93 | 1131 | 0.1% |
| 30-jan-04 | 1032 | 0.1% |
| 8-jul-93 | 780 | 0.1% |
| 4-oct-04 | 658 | 0.1% |
| 30-sep-03 | 608 | 0.1% |
| 30-jun-05 | 572 | 0.1% |
| 18-apr-05 | 534 | 0.1% |
| 6-jul-93 | 523 | 0.1% |
| 21-jan-05 | 498 | 0.1% |
| 27-sep-02 | 497 | 0.1% |
| Other values (9849) | 892331 |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 1798328 | |
| 0 | 687310 | 8.8% |
| 1 | 492781 | 6.3% |
| 9 | 470677 | 6.0% |
| 2 | 464364 | 5.9% |
| u | 233553 | 3.0% |
| 3 | 229057 | 2.9% |
| a | 227906 | 2.9% |
| J | 221861 | 2.8% |
| e | 219341 | 2.8% |
| Other values (23) | 2796557 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 3345915 | |
| Dash Punctuation | 1798328 | |
| Lowercase Letter | 1798328 | |
| Uppercase Letter | 899164 | 11.5% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| u | 233553 | |
| a | 227906 | |
| e | 219341 | |
| r | 163835 | |
| p | 163275 | |
| n | 145374 | |
| c | 139688 | |
| g | 78776 | 4.4% |
| y | 77194 | 4.3% |
| l | 76487 | 4.3% |
| Other values (4) | 272899 |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 687310 | |
| 1 | 492781 | |
| 9 | 470677 | |
| 2 | 464364 | |
| 3 | 229057 | 6.8% |
| 6 | 208904 | 6.2% |
| 5 | 203699 | 6.1% |
| 7 | 199006 | 5.9% |
| 4 | 197260 | 5.9% |
| 8 | 192857 | 5.8% |
Uppercase Letter
| Value | Count | Frequency (%) |
| J | 221861 | |
| M | 160822 | |
| A | 158983 | |
| S | 83068 | 9.2% |
| D | 69931 | 7.8% |
| O | 69757 | 7.8% |
| N | 68400 | 7.6% |
| F | 66342 | 7.4% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 1798328 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 5144243 | |
| Latin | 2697492 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| u | 233553 | 8.7% |
| a | 227906 | 8.4% |
| J | 221861 | 8.2% |
| e | 219341 | 8.1% |
| r | 163835 | 6.1% |
| p | 163275 | 6.1% |
| M | 160822 | 6.0% |
| A | 158983 | 5.9% |
| n | 145374 | 5.4% |
| c | 139688 | 5.2% |
| Other values (12) | 862854 |
Common
| Value | Count | Frequency (%) |
| - | 1798328 | |
| 0 | 687310 | 13.4% |
| 1 | 492781 | 9.6% |
| 9 | 470677 | 9.1% |
| 2 | 464364 | 9.0% |
| 3 | 229057 | 4.5% |
| 6 | 208904 | 4.1% |
| 5 | 203699 | 4.0% |
| 7 | 199006 | 3.9% |
| 4 | 197260 | 3.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 7841735 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 1798328 | |
| 0 | 687310 | 8.8% |
| 1 | 492781 | 6.3% |
| 9 | 470677 | 6.0% |
| 2 | 464364 | 5.9% |
| u | 233553 | 3.0% |
| 3 | 229057 | 2.9% |
| a | 227906 | 2.9% |
| J | 221861 | 2.8% |
| e | 219341 | 2.8% |
| Other values (23) | 2796557 |
ApprovalFY
Categorical
| Distinct | 52 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.9 MiB |
Length
| Max length | 5 |
|---|---|
| Median length | 4 |
| Mean length | 4.000020019 |
| Min length | 4 |
Characters and Unicode
| Total characters | 3596674 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 3 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 1997 |
|---|---|
| 2nd row | 1997 |
| 3rd row | 1997 |
| 4th row | 1997 |
| 5th row | 1997 |
Common Values
| Value | Count | Frequency (%) |
| 2005 | 77525 | 8.6% |
| 2006 | 76040 | 8.5% |
| 2007 | 71876 | 8.0% |
| 2004 | 68290 | 7.6% |
| 2003 | 58193 | 6.5% |
| 1995 | 45758 | 5.1% |
| 2002 | 44391 | 4.9% |
| 1996 | 40112 | 4.5% |
| 2008 | 39540 | 4.4% |
| 1997 | 37748 | 4.2% |
| Other values (42) | 339691 |
Words
| Value | Count | Frequency (%) |
| 2005 | 77525 | 8.6% |
| 2006 | 76040 | 8.5% |
| 2007 | 71876 | 8.0% |
| 2004 | 68290 | 7.6% |
| 2003 | 58193 | 6.5% |
| 1995 | 45758 | 5.1% |
| 2002 | 44391 | 4.9% |
| 1996 | 40112 | 4.5% |
| 2008 | 39540 | 4.4% |
| 1997 | 37748 | 4.2% |
| Other values (42) | 339691 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1167176 | |
| 9 | 704676 | |
| 2 | 639911 | |
| 1 | 435726 | 12.1% |
| 5 | 125258 | 3.5% |
| 6 | 118366 | 3.3% |
| 7 | 112975 | 3.1% |
| 8 | 104656 | 2.9% |
| 4 | 102220 | 2.8% |
| 3 | 85692 | 2.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 3596656 | |
| Uppercase Letter | 18 | < 0.1% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 1167176 | |
| 9 | 704676 | |
| 2 | 639911 | |
| 1 | 435726 | 12.1% |
| 5 | 125258 | 3.5% |
| 6 | 118366 | 3.3% |
| 7 | 112975 | 3.1% |
| 8 | 104656 | 2.9% |
| 4 | 102220 | 2.8% |
| 3 | 85692 | 2.4% |
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 18 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 3596656 | |
| Latin | 18 | < 0.1% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 1167176 | |
| 9 | 704676 | |
| 2 | 639911 | |
| 1 | 435726 | 12.1% |
| 5 | 125258 | 3.5% |
| 6 | 118366 | 3.3% |
| 7 | 112975 | 3.1% |
| 8 | 104656 | 2.9% |
| 4 | 102220 | 2.8% |
| 3 | 85692 | 2.4% |
Latin
| Value | Count | Frequency (%) |
| A | 18 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3596674 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 1167176 | |
| 9 | 704676 | |
| 2 | 639911 | |
| 1 | 435726 | 12.1% |
| 5 | 125258 | 3.5% |
| 6 | 118366 | 3.3% |
| 7 | 112975 | 3.1% |
| 8 | 104656 | 2.9% |
| 4 | 102220 | 2.8% |
| 3 | 85692 | 2.4% |
Term
| Distinct | 412 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 110.7730781 |

| Minimum | 0 |
|---|---|
| Maximum | 569 |
| Zeros | 810 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.9 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 16 |
| Q1 | 60 |
| median | 84 |
| Q3 | 120 |
| 95-th percentile | 300 |
| Maximum | 569 |
| Range | 569 |
| Interquartile range (IQR) | 60 |
Descriptive statistics
| Standard deviation | 78.85730507 |
|---|---|
| Coefficient of variation (CV) | 0.7118815006 |
| Kurtosis | 0.1857042421 |
| Mean | 110.7730781 |
| Median Absolute Deviation (MAD) | 33 |
| Skewness | 1.120925802 |
| Sum | 99603164 |
| Variance | 6218.474562 |
| Monotonicity | Not monotonic |
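Several of the descriptive statistics above are derived from one another. The following is a minimal sketch that re-derives them from the reported mean, standard deviation, quartiles, and extremes. The variable names are illustrative, and the row count of 899164 is taken from the dataset overview, assuming no missing values for this variable.

```python
# Values copied from the statistics tables above.
n_rows = 899164          # number of observations in the dataset overview
mean = 110.7730781
std = 78.85730507
q1, q3 = 60, 120
minimum, maximum = 0, 569

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
total = mean * n_rows            # sum of all values (approximate, mean is rounded)

print(f"CV={cv:.6f} variance={variance:.3f} IQR={iqr} range={value_range}")
print(f"sum is approximately {total:.0f}")
```

The recomputed values should agree with the table up to the rounding of the printed mean and standard deviation.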
Common Values
| Value | Count | Frequency (%) |
| 84 | 230162 | |
| 60 | 89945 | 10.0% |
| 240 | 85982 | 9.6% |
| 120 | 77654 | 8.6% |
| 300 | 44727 | 5.0% |
| 180 | 28164 | 3.1% |
| 36 | 19800 | 2.2% |
| 12 | 17095 | 1.9% |
| 48 | 15621 | 1.7% |
| 72 | 9419 | 1.0% |
| Other values (402) | 280595 |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 810 | 0.1% |
| 1 | 1608 | |
| 2 | 1809 | |
| 3 | 2112 | |
| 4 | 2173 | |
| 5 | 1866 | |
| 6 | 3054 | |
| 7 | 1761 | |
| 8 | 1693 | |
| 9 | 1875 |
Maximum 10 values
| Value | Count | Frequency (%) |
| 569 | 1 | |
| 527 | 1 | |
| 511 | 1 | |
| 505 | 1 | |
| 481 | 1 | |
| 480 | 1 | |
| 461 | 1 | |
| 449 | 1 | |
| 445 | 1 | |
| 443 | 1 |
| Distinct | 599 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 11.41135321 |

| Minimum | 0 |
|---|---|
| Maximum | 9999 |
| Zeros | 6631 |
| Zeros (%) | 0.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.9 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 4 |
| Q3 | 10 |
| 95-th percentile | 40 |
| Maximum | 9999 |
| Range | 9999 |
| Interquartile range (IQR) | 8 |
Descriptive statistics
| Standard deviation | 74.10819634 |
|---|---|
| Coefficient of variation (CV) | 6.494251379 |
| Kurtosis | 7965.288643 |
| Mean | 11.41135321 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 80.24824355 |
| Sum | 10260678 |
| Variance | 5492.024764 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 1 | 154254 | |
| 2 | 138297 | |
| 3 | 90674 | |
| 4 | 73644 | 8.2% |
| 5 | 60319 | 6.7% |
| 6 | 45759 | 5.1% |
| 10 | 31536 | 3.5% |
| 7 | 31495 | 3.5% |
| 8 | 31361 | 3.5% |
| 12 | 20822 | 2.3% |
| Other values (589) | 221003 |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 6631 | 0.7% |
| 1 | 154254 | |
| 2 | 138297 | |
| 3 | 90674 | |
| 4 | 73644 | |
| 5 | 60319 | 6.7% |
| 6 | 45759 | 5.1% |
| 7 | 31495 | 3.5% |
| 8 | 31361 | 3.5% |
| 9 | 18131 | 2.0% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 9999 | 4 | |
| 9992 | 1 | < 0.1% |
| 9945 | 1 | < 0.1% |
| 9090 | 1 | < 0.1% |
| 9000 | 2 | < 0.1% |
| 8500 | 1 | < 0.1% |
| 8041 | 1 | < 0.1% |
| 8018 | 1 | < 0.1% |
| 8000 | 7 | |
| 7999 | 1 | < 0.1% |
NewExist
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 136 |
| Missing (%) | < 0.1% |
| Memory size | 6.9 MiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 2697084 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2.0 |
|---|---|
| 2nd row | 2.0 |
| 3rd row | 1.0 |
| 4th row | 1.0 |
| 5th row | 1.0 |
Common Values
| Value | Count | Frequency (%) |
| 1.0 | 644869 | |
| 2.0 | 253125 | 28.2% |
| 0.0 | 1034 | 0.1% |
| (Missing) | 136 | < 0.1% |
Words
| Value | Count | Frequency (%) |
| 1.0 | 644869 | |
| 2.0 | 253125 | 28.2% |
| 0.0 | 1034 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 900062 | |
| . | 899028 | |
| 1 | 644869 | |
| 2 | 253125 | 9.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1798056 | |
| Other Punctuation | 899028 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 900062 | |
| 1 | 644869 | |
| 2 | 253125 | 14.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 899028 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 2697084 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 900062 | |
| . | 899028 | |
| 1 | 644869 | |
| 2 | 253125 | 9.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2697084 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 900062 | |
| . | 899028 | |
| 1 | 644869 | |
| 2 | 253125 | 9.4% |
CreateJob
| Distinct | 246 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8.430376439 |

| Minimum | 0 |
|---|---|
| Maximum | 8800 |
| Zeros | 629248 |
| Zeros (%) | 70.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.9 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 10 |
| Maximum | 8800 |
| Range | 8800 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 236.6881652 |
|---|---|
| Coefficient of variation (CV) | 28.07563422 |
| Kurtosis | 1369.91097 |
| Mean | 8.430376439 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 36.99135473 |
| Sum | 7580291 |
| Variance | 56021.28756 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 0 | 629248 | |
| 1 | 63174 | 7.0% |
| 2 | 57831 | 6.4% |
| 3 | 28806 | 3.2% |
| 4 | 20511 | 2.3% |
| 5 | 18691 | 2.1% |
| 10 | 11602 | 1.3% |
| 6 | 11009 | 1.2% |
| 8 | 7378 | 0.8% |
| 7 | 6374 | 0.7% |
| Other values (236) | 44540 | 5.0% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 629248 | |
| 1 | 63174 | 7.0% |
| 2 | 57831 | 6.4% |
| 3 | 28806 | 3.2% |
| 4 | 20511 | 2.3% |
| 5 | 18691 | 2.1% |
| 6 | 11009 | 1.2% |
| 7 | 6374 | 0.7% |
| 8 | 7378 | 0.8% |
| 9 | 3330 | 0.4% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 8800 | 648 | |
| 5621 | 1 | < 0.1% |
| 5199 | 1 | < 0.1% |
| 5085 | 1 | < 0.1% |
| 3500 | 1 | < 0.1% |
| 3100 | 1 | < 0.1% |
| 3000 | 4 | < 0.1% |
| 2515 | 1 | < 0.1% |
| 2140 | 1 | < 0.1% |
| 2020 | 1 | < 0.1% |
RetainedJob
| Distinct | 358 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10.79725723 |

| Minimum | 0 |
|---|---|
| Maximum | 9500 |
| Zeros | 440403 |
| Zeros (%) | 49.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.9 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1 |
| Q3 | 4 |
| 95-th percentile | 20 |
| Maximum | 9500 |
| Range | 9500 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 237.1205997 |
|---|---|
| Coefficient of variation (CV) | 21.96118835 |
| Kurtosis | 1362.018162 |
| Mean | 10.79725723 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 36.85481184 |
| Sum | 9708505 |
| Variance | 56226.1788 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 0 | 440403 | |
| 1 | 88790 | 9.9% |
| 2 | 76851 | 8.5% |
| 3 | 49963 | 5.6% |
| 4 | 39666 | 4.4% |
| 5 | 32627 | 3.6% |
| 6 | 23796 | 2.6% |
| 7 | 16530 | 1.8% |
| 8 | 15698 | 1.7% |
| 10 | 15438 | 1.7% |
| Other values (348) | 99402 | 11.1% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 440403 | |
| 1 | 88790 | 9.9% |
| 2 | 76851 | 8.5% |
| 3 | 49963 | 5.6% |
| 4 | 39666 | 4.4% |
| 5 | 32627 | 3.6% |
| 6 | 23796 | 2.6% |
| 7 | 16530 | 1.8% |
| 8 | 15698 | 1.7% |
| 9 | 8735 | 1.0% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 9500 | 1 | < 0.1% |
| 8800 | 648 | |
| 7250 | 1 | < 0.1% |
| 5000 | 1 | < 0.1% |
| 4441 | 1 | < 0.1% |
| 4000 | 2 | < 0.1% |
| 3900 | 1 | < 0.1% |
| 3860 | 1 | < 0.1% |
| 3225 | 1 | < 0.1% |
| 3200 | 1 | < 0.1% |
| Distinct | 2768 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2753.725933 |

| Minimum | 0 |
|---|---|
| Maximum | 99999 |
| Zeros | 208835 |
| Zeros (%) | 23.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.9 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 15805 |
| Maximum | 99999 |
| Range | 99999 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 12758.01914 |
|---|---|
| Coefficient of variation (CV) | 4.633002501 |
| Kurtosis | 24.40952381 |
| Mean | 2753.725933 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 4.975215215 |
| Sum | 2476051225 |
| Variance | 162767052.3 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 1 | 638554 | |
| 0 | 208835 | 23.2% |
| 78760 | 3373 | 0.4% |
| 68020 | 1921 | 0.2% |
| 50564 | 1034 | 0.1% |
| 21780 | 1003 | 0.1% |
| 25650 | 715 | 0.1% |
| 79140 | 659 | 0.1% |
| 22470 | 615 | 0.1% |
| 17998 | 606 | 0.1% |
| Other values (2758) | 41849 | 4.7% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 208835 | 23.2% |
| 1 | 638554 | |
| 3 | 12 | < 0.1% |
| 395 | 5 | < 0.1% |
| 399 | 3 | < 0.1% |
| 400 | 2 | < 0.1% |
| 401 | 12 | < 0.1% |
| 404 | 1 | < 0.1% |
| 407 | 34 | < 0.1% |
| 414 | 2 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 99999 | 1 | < 0.1% |
| 92006 | 4 | < 0.1% |
| 92000 | 9 | |
| 91999 | 11 | |
| 91450 | 2 | < 0.1% |
| 91446 | 1 | < 0.1% |
| 91443 | 2 | < 0.1% |
| 91435 | 1 | < 0.1% |
| 91424 | 1 | < 0.1% |
| 91423 | 2 | < 0.1% |
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.9 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 899164 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 1 | 470654 | |
| 0 | 323167 | |
| 2 | 105343 | 11.7% |
Words
| Value | Count | Frequency (%) |
| 1 | 470654 | |
| 0 | 323167 | |
| 2 | 105343 | 11.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 470654 | |
| 0 | 323167 | |
| 2 | 105343 | 11.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 899164 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 470654 | |
| 0 | 323167 | |
| 2 | 105343 | 11.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 899164 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 470654 | |
| 0 | 323167 | |
| 2 | 105343 | 11.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 899164 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 470654 | |
| 0 | 323167 | |
| 2 | 105343 | 11.7% |
| Distinct | 18 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 4528 |
| Missing (%) | 0.5% |
| Memory size | 6.9 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 894636 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 9 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | N |
|---|---|
| 2nd row | N |
| 3rd row | N |
| 4th row | N |
| 5th row | N |
Common Values
| Value | Count | Frequency (%) |
| N | 420288 | |
| 0 | 257602 | |
| Y | 201397 | |
| T | 15284 | 1.7% |
| 1 | 23 | < 0.1% |
| R | 14 | < 0.1% |
| ` | 11 | < 0.1% |
| 2 | 6 | < 0.1% |
| C | 2 | < 0.1% |
| 5 | 1 | < 0.1% |
| Other values (8) | 8 | < 0.1% |
| (Missing) | 4528 | 0.5% |
Words
| Value | Count | Frequency (%) |
| n | 420288 | |
| 0 | 257602 | |
| y | 201397 | |
| t | 15284 | 1.7% |
| 1 | 23 | < 0.1% |
| r | 14 | < 0.1% |
| (empty) | 14 | < 0.1% |
| 2 | 6 | < 0.1% |
| c | 2 | < 0.1% |
| 5 | 1 | < 0.1% |
| Other values (5) | 5 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| N | 420288 | |
| 0 | 257602 | |
| Y | 201397 | |
| T | 15284 | 1.7% |
| 1 | 23 | < 0.1% |
| R | 14 | < 0.1% |
| ` | 11 | < 0.1% |
| 2 | 6 | < 0.1% |
| C | 2 | < 0.1% |
| 3 | 1 | < 0.1% |
| Other values (8) | 8 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 636987 | |
| Decimal Number | 257635 | |
| Modifier Symbol | 11 | < 0.1% |
| Other Punctuation | 2 | < 0.1% |
| Dash Punctuation | 1 | < 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 420288 | |
| Y | 201397 | |
| T | 15284 | 2.4% |
| R | 14 | < 0.1% |
| C | 2 | < 0.1% |
| A | 1 | < 0.1% |
| Q | 1 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 257602 | |
| 1 | 23 | < 0.1% |
| 2 | 6 | < 0.1% |
| 3 | 1 | < 0.1% |
| 7 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| , | 1 | |
| . | 1 |
Modifier Symbol
| Value | Count | Frequency (%) |
| ` | 11 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 636987 | |
| Common | 257649 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 257602 | |
| 1 | 23 | < 0.1% |
| ` | 11 | < 0.1% |
| 2 | 6 | < 0.1% |
| 3 | 1 | < 0.1% |
| , | 1 | < 0.1% |
| 7 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
| . | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
Latin
| Value | Count | Frequency (%) |
| N | 420288 | |
| Y | 201397 | |
| T | 15284 | 2.4% |
| R | 14 | < 0.1% |
| C | 2 | < 0.1% |
| A | 1 | < 0.1% |
| Q | 1 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 894636 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| N | 420288 | |
| 0 | 257602 | |
| Y | 201397 | |
| T | 15284 | 1.7% |
| 1 | 23 | < 0.1% |
| R | 14 | < 0.1% |
| ` | 11 | < 0.1% |
| 2 | 6 | < 0.1% |
| C | 2 | < 0.1% |
| 3 | 1 | < 0.1% |
| Other values (8) | 8 | < 0.1% |
LowDoc
Categorical
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 2582 |
| Missing (%) | 0.3% |
| Memory size | 6.9 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 896582 |
|---|---|
| Distinct characters | 8 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Y |
|---|---|
| 2nd row | Y |
| 3rd row | N |
| 4th row | Y |
| 5th row | N |
Common Values
| Value | Count | Frequency (%) |
| N | 782822 | |
| Y | 110335 | 12.3% |
| 0 | 1491 | 0.2% |
| C | 758 | 0.1% |
| S | 603 | 0.1% |
| A | 497 | 0.1% |
| R | 75 | < 0.1% |
| 1 | 1 | < 0.1% |
| (Missing) | 2582 | 0.3% |
Words
| Value | Count | Frequency (%) |
| n | 782822 | |
| y | 110335 | 12.3% |
| 0 | 1491 | 0.2% |
| c | 758 | 0.1% |
| s | 603 | 0.1% |
| a | 497 | 0.1% |
| r | 75 | < 0.1% |
| 1 | 1 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| N | 782822 | |
| Y | 110335 | 12.3% |
| 0 | 1491 | 0.2% |
| C | 758 | 0.1% |
| S | 603 | 0.1% |
| A | 497 | 0.1% |
| R | 75 | < 0.1% |
| 1 | 1 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 895090 | |
| Decimal Number | 1492 | 0.2% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 782822 | |
| Y | 110335 | 12.3% |
| C | 758 | 0.1% |
| S | 603 | 0.1% |
| A | 497 | 0.1% |
| R | 75 | < 0.1% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 1491 | |
| 1 | 1 | 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 895090 | |
| Common | 1492 | 0.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| N | 782822 | |
| Y | 110335 | 12.3% |
| C | 758 | 0.1% |
| S | 603 | 0.1% |
| A | 497 | 0.1% |
| R | 75 | < 0.1% |
Common
| Value | Count | Frequency (%) |
| 0 | 1491 | |
| 1 | 1 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 896582 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| N | 782822 | |
| Y | 110335 | 12.3% |
| 0 | 1491 | 0.2% |
| C | 758 | 0.1% |
| S | 603 | 0.1% |
| A | 497 | 0.1% |
| R | 75 | < 0.1% |
| 1 | 1 | < 0.1% |
ChgOffDate
Categorical
| Distinct | 6448 |
|---|---|
| Distinct (%) | 4.0% |
| Missing | 736465 |
| Missing (%) | 81.9% |
| Memory size | 6.9 MiB |
Length
| Max length | 9 |
|---|---|
| Median length | 9 |
| Mean length | 8.716304341 |
| Min length | 8 |
Characters and Unicode
| Total characters | 1418134 |
|---|---|
| Distinct characters | 33 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 861 |
|---|---|
| Unique (%) | 0.5% |
Sample
| 1st row | 24-Jun-91 |
|---|---|
| 2nd row | 18-Apr-02 |
| 3rd row | 4-Oct-89 |
| 4th row | 26-Jun-14 |
| 5th row | 4-Oct-05 |
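The sample values above are day-month-year strings with abbreviated English month names and two-digit years, which is why the report treats the column as categorical text. The following is a minimal sketch of converting such strings to dates with the standard library. `parse_report_date` is a hypothetical helper, not part of the report, and the sketch assumes an English locale for `%b`. Python's `%y` follows the POSIX convention (69 to 99 map to 1969 to 1999, 00 to 68 map to 2000 to 2068), which may or may not match the intended century for every record.

```python
from datetime import date, datetime


def parse_report_date(text: str) -> date:
    """Parse a string such as '4-Oct-89' (hypothetical helper, assumes English month names)."""
    return datetime.strptime(text.strip(), "%d-%b-%y").date()


# The five sample rows shown in the table above.
samples = ["24-Jun-91", "18-Apr-02", "4-Oct-89", "26-Jun-14", "4-Oct-05"]
parsed = [parse_report_date(s) for s in samples]

for raw, value in zip(samples, parsed):
    print(raw, "->", value.isoformat())
```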
Common Values
| Value | Count | Frequency (%) |
| 13-Mar-10 | 734 | 0.1% |
| 20-Feb-10 | 614 | 0.1% |
| 30-Jan-10 | 519 | 0.1% |
| 6-Feb-10 | 461 | 0.1% |
| 6-Mar-10 | 422 | < 0.1% |
| 10-Jun-10 | 415 | < 0.1% |
| 20-Mar-10 | 414 | < 0.1% |
| 13-Feb-10 | 400 | < 0.1% |
| 7-Jun-10 | 350 | < 0.1% |
| 3-Jun-10 | 338 | < 0.1% |
| Other values (6438) | 158032 | 17.6% |
| (Missing) | 736465 |
Words
| Value | Count | Frequency (%) |
| 13-mar-10 | 734 | 0.5% |
| 20-feb-10 | 614 | 0.4% |
| 30-jan-10 | 519 | 0.3% |
| 6-feb-10 | 461 | 0.3% |
| 6-mar-10 | 422 | 0.3% |
| 10-jun-10 | 415 | 0.3% |
| 20-mar-10 | 414 | 0.3% |
| 13-feb-10 | 400 | 0.2% |
| 7-jun-10 | 350 | 0.2% |
| 3-jun-10 | 338 | 0.2% |
| Other values (6438) | 158032 |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 325398 | |
| 1 | 177588 | 12.5% |
| 0 | 126799 | 8.9% |
| 2 | 83425 | 5.9% |
| u | 48822 | 3.4% |
| 9 | 46885 | 3.3% |
| J | 44922 | 3.2% |
| a | 43197 | 3.0% |
| 8 | 38336 | 2.7% |
| e | 37857 | 2.7% |
| Other values (23) | 444905 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 604639 | |
| Dash Punctuation | 325398 | |
| Lowercase Letter | 325398 | |
| Uppercase Letter | 162699 | 11.5% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| u | 48822 | |
| a | 43197 | |
| e | 37857 | |
| n | 30637 | |
| r | 28866 | |
| p | 28398 | |
| c | 21231 | |
| g | 16046 | 4.9% |
| y | 15627 | 4.8% |
| l | 14285 | 4.4% |
| Other values (4) | 40432 |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 177588 | |
| 0 | 126799 | |
| 2 | 83425 | |
| 9 | 46885 | 7.8% |
| 8 | 38336 | 6.3% |
| 3 | 37546 | 6.2% |
| 6 | 28366 | 4.7% |
| 7 | 23654 | 3.9% |
| 4 | 22727 | 3.8% |
| 5 | 19313 | 3.2% |
Uppercase Letter
| Value | Count | Frequency (%) |
| J | 44922 | |
| M | 31051 | |
| A | 29488 | |
| S | 14956 | 9.2% |
| F | 12352 | 7.6% |
| O | 10682 | 6.6% |
| D | 10549 | 6.5% |
| N | 8699 | 5.3% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 325398 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 930037 | |
| Latin | 488097 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| u | 48822 | 10.0% |
| J | 44922 | 9.2% |
| a | 43197 | 8.9% |
| e | 37857 | 7.8% |
| M | 31051 | 6.4% |
| n | 30637 | 6.3% |
| A | 29488 | 6.0% |
| r | 28866 | 5.9% |
| p | 28398 | 5.8% |
| c | 21231 | 4.3% |
| Other values (12) | 143628 |
Common
| Value | Count | Frequency (%) |
| - | 325398 | |
| 1 | 177588 | |
| 0 | 126799 | 13.6% |
| 2 | 83425 | 9.0% |
| 9 | 46885 | 5.0% |
| 8 | 38336 | 4.1% |
| 3 | 37546 | 4.0% |
| 6 | 28366 | 3.0% |
| 7 | 23654 | 2.5% |
| 4 | 22727 | 2.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1418134 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| - | 325398 | |
| 1 | 177588 | 12.5% |
| 0 | 126799 | 8.9% |
| 2 | 83425 | 5.9% |
| u | 48822 | 3.4% |
| 9 | 46885 | 3.3% |
| J | 44922 | 3.2% |
| a | 43197 | 3.0% |
| 8 | 38336 | 2.7% |
| e | 37857 | 2.7% |
| Other values (23) | 444905 |
DisbursementDate
Date
| Distinct | 8472 |
|---|---|
| Distinct (%) | 0.9% |
| Missing | 2368 |
| Missing (%) | 0.3% |
| Memory size | 6.9 MiB |

| Minimum | 1972-02-01 00:00:00 |
|---|---|
| Maximum | 2071-12-31 00:00:00 |
DisbursementGross
| Distinct | 118859 |
|---|---|
| Distinct (%) | 13.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 201154.0167 |

| Minimum | 0 |
|---|---|
| Maximum | 11446325 |
| Zeros | 196 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.9 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 10000 |
| Q1 | 42000 |
| median | 100000 |
| Q3 | 238000 |
| 95-th percentile | 761892.5 |
| Maximum | 11446325 |
| Range | 11446325 |
| Interquartile range (IQR) | 196000 |
Descriptive statistics
| Standard deviation | 287640.85 |
|---|---|
| Coefficient of variation (CV) | 1.4299533 |
| Kurtosis | 35.08859907 |
| Mean | 201154.0167 |
| Median Absolute Deviation (MAD) | 70000 |
| Skewness | 3.940992083 |
| Sum | 1.808704503 × 10^11 |
| Variance | 8.273725858 × 10^10 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
| 50000 | 43787 | 4.9% |
| 100000 | 36714 | 4.1% |
| 25000 | 27387 | 3.0% |
| 150000 | 23373 | 2.6% |
| 10000 | 21328 | 2.4% |
| 35000 | 14748 | 1.6% |
| 5000 | 14193 | 1.6% |
| 75000 | 13528 | 1.5% |
| 20000 | 13462 | 1.5% |
| 30000 | 12696 | 1.4% |
| Other values (118849) | 677948 |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 196 | |
| 1 | 11 | < 0.1% |
| 2 | 3 | < 0.1% |
| 3 | 3 | < 0.1% |
| 4 | 3 | < 0.1% |
| 5 | 2 | < 0.1% |
| 6 | 4 | < 0.1% |
| 7 | 3 | < 0.1% |
| 8 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 11446325 | 1 | |
| 11000000 | 1 | |
| 10465000 | 1 | |
| 9284449 | 1 | |
| 8995000 | 1 | |
| 8607858 | 1 | |
| 8602584 | 1 | |
| 7853275 | 1 | |
| 7699233 | 1 | |
| 7573881 | 1 |
BalanceGross
Categorical
| Distinct | 15 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.9 MiB |
Length
| Max length | 12 |
|---|---|
| Median length | 6 |
| Mean length | 6.000076738 |
| Min length | 6 |
Characters and Unicode
| Total characters | 5395053 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 4 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 14 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | $0.00 |
|---|---|
| 2nd row | $0.00 |
| 3rd row | $0.00 |
| 4th row | $0.00 |
| 5th row | $0.00 |
Common Values
| Value | Count | Frequency (%) |
| $0.00 | 899150 | |
| $12,750.00 | 1 | < 0.1% |
| $827,875.00 | 1 | < 0.1% |
| $25,000.00 | 1 | < 0.1% |
| $37,100.00 | 1 | < 0.1% |
| $43,127.00 | 1 | < 0.1% |
| $84,617.00 | 1 | < 0.1% |
| $1,760.00 | 1 | < 0.1% |
| $115,820.00 | 1 | < 0.1% |
| $996,262.00 | 1 | < 0.1% |
| Other values (5) | 5 | < 0.1% |
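The values above are currency strings, so the column is profiled as categorical rather than numeric. The character counts also show one space per value, which suggests padding around each amount. The following is a minimal sketch of converting such strings to exact decimal amounts. `parse_currency` is a hypothetical helper, and it assumes US-style formatting with a `$` prefix, comma thousands separators, and optional surrounding whitespace.

```python
from decimal import Decimal


def parse_currency(text: str) -> Decimal:
    """Convert a string such as '$12,750.00 ' to a Decimal (hypothetical helper)."""
    cleaned = text.strip().replace("$", "").replace(",", "")
    return Decimal(cleaned)


# A few of the values listed in the table above, with assumed trailing padding.
examples = ["$0.00 ", "$12,750.00 ", "$827,875.00 ", "$1,760.00 "]
amounts = [parse_currency(s) for s in examples]

for raw, amount in zip(examples, amounts):
    print(repr(raw), "->", amount)
```

`Decimal` is used here instead of `float` so that cent values are preserved exactly; a float conversion would also work for summary statistics.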
Words
| Value | Count | Frequency (%) |
| 0.00 | 899150 | |
| 12,750.00 | 1 | < 0.1% |
| 827,875.00 | 1 | < 0.1% |
| 25,000.00 | 1 | < 0.1% |
| 37,100.00 | 1 | < 0.1% |
| 43,127.00 | 1 | < 0.1% |
| 84,617.00 | 1 | < 0.1% |
| 1,760.00 | 1 | < 0.1% |
| 115,820.00 | 1 | < 0.1% |
| 996,262.00 | 1 | < 0.1% |
| Other values (5) | 5 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 2697490 | |
| $ | 899164 | 16.7% |
| . | 899164 | 16.7% |
| (space) | 899164 | 16.7% |
| , | 13 | < 0.1% |
| 1 | 11 | < 0.1% |
| 7 | 8 | < 0.1% |
| 2 | 7 | < 0.1% |
| 6 | 7 | < 0.1% |
| 9 | 7 | < 0.1% |
| Other values (4) | 18 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 2697548 | |
| Other Punctuation | 899177 | 16.7% |
| Currency Symbol | 899164 | 16.7% |
| Space Separator | 899164 | 16.7% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 2697490 | |
| 1 | 11 | < 0.1% |
| 7 | 8 | < 0.1% |
| 2 | 7 | < 0.1% |
| 6 | 7 | < 0.1% |
| 9 | 7 | < 0.1% |
| 5 | 6 | < 0.1% |
| 8 | 5 | < 0.1% |
| 4 | 4 | < 0.1% |
| 3 | 3 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 899164 | |
| , | 13 | < 0.1% |
Currency Symbol
| Value | Count | Frequency (%) |
| $ | 899164 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 899164 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 5395053 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 2697490 | |
| $ | 899164 | 16.7% |
| . | 899164 | 16.7% |
| (space) | 899164 | 16.7% |
| , | 13 | < 0.1% |
| 1 | 11 | < 0.1% |
| 7 | 8 | < 0.1% |
| 2 | 7 | < 0.1% |
| 6 | 7 | < 0.1% |
| 9 | 7 | < 0.1% |
| Other values (4) | 18 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 5395053 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 2697490 | |
| $ | 899164 | 16.7% |
| . | 899164 | 16.7% |
| (space) | 899164 | 16.7% |
| , | 13 | < 0.1% |
| 1 | 11 | < 0.1% |
| 7 | 8 | < 0.1% |
| 2 | 7 | < 0.1% |
| 6 | 7 | < 0.1% |
| 9 | 7 | < 0.1% |
| Other values (4) | 18 | < 0.1% |
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 1997 |
| Missing (%) | 0.2% |
| Memory size | 6.9 MiB |
Length
| Max length | 6 |
|---|---|
| Median length | 5 |
| Mean length | 5.175617249 |
| Min length | 5 |
Characters and Unicode
| Total characters | 4643393 |
|---|---|
| Distinct characters | 8 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | P I F |
|---|---|
| 2nd row | P I F |
| 3rd row | P I F |
| 4th row | P I F |
| 5th row | P I F |
Common Values
| Value | Count | Frequency (%) |
| P I F | 739609 | |
| CHGOFF | 157558 | 17.5% |
| (Missing) | 1997 | 0.2% |
Words
| Value | Count | Frequency (%) |
| p | 739609 | |
| i | 739609 | |
| f | 739609 | |
| chgoff | 157558 | 6.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 1479218 | |
| F | 1054725 | |
| P | 739609 | |
| I | 739609 | |
| C | 157558 | 3.4% |
| H | 157558 | 3.4% |
| G | 157558 | 3.4% |
| O | 157558 | 3.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 3164175 | |
| Space Separator | 1479218 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| F | 1054725 | |
| P | 739609 | |
| I | 739609 | |
| C | 157558 | 5.0% |
| H | 157558 | 5.0% |
| G | 157558 | 5.0% |
| O | 157558 | 5.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1479218 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3164175 | |
| Common | 1479218 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| F | 1054725 | |
| P | 739609 | |
| I | 739609 | |
| C | 157558 | 5.0% |
| H | 157558 | 5.0% |
| G | 157558 | 5.0% |
| O | 157558 | 5.0% |
Common
| Value | Count | Frequency (%) |
| (space) | 1479218 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4643393 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 1479218 | |
| F | 1054725 | |
| P | 739609 | |
| I | 739609 | |
| C | 157558 | 3.4% |
| H | 157558 | 3.4% |
| G | 157558 | 3.4% |
| O | 157558 | 3.4% |
| Distinct | 83165 |
|---|---|
| Distinct (%) | 9.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13503.29513 |

| Minimum | 0 |
|---|---|
| Maximum | 3512596 |
| Zeros | 737152 |
| Zeros (%) | 82.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.9 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 64888.85 |
| Maximum | 3512596 |
| Range | 3512596 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 65152.29269 |
|---|---|
| Coefficient of variation (CV) | 4.824918072 |
| Kurtosis | 184.3191639 |
| Mean | 13503.29513 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 11.22096997 |
| Sum | 1.214167686 × 10¹⁰ |
| Variance | 4244821243 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 737152 | 82.0% |
| 50000 | 2110 | 0.2% |
| 10000 | 1865 | 0.2% |
| 25000 | 1371 | 0.2% |
| 35000 | 1345 | 0.1% |
| 100000 | 1028 | 0.1% |
| 20000 | 594 | 0.1% |
| 30000 | 492 | 0.1% |
| 15000 | 467 | 0.1% |
| 5000 | 356 | < 0.1% |
| Other values (83155) | 152384 | 16.9% |
| Value | Count | Frequency (%) |
| 0 | 737152 | 82.0% |
| 1 | 6 | < 0.1% |
| 3 | 3 | < 0.1% |
| 4 | 2 | < 0.1% |
| 5 | 5 | < 0.1% |
| 6 | 3 | < 0.1% |
| 8 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
| 10 | 1 | < 0.1% |
| 11 | 3 | < 0.1% |
| Value | Count | Frequency (%) |
| 3512596 | 1 | < 0.1% |
| 2223766 | 1 | < 0.1% |
| 2157499 | 1 | < 0.1% |
| 1999999 | 1 | < 0.1% |
| 1961398 | 1 | < 0.1% |
| 1933715 | 1 | < 0.1% |
| 1932180 | 1 | < 0.1% |
| 1931439 | 1 | < 0.1% |
| 1926148 | 1 | < 0.1% |
| 1917676 | 1 | < 0.1% |
GrAppv
| Distinct | 22128 |
|---|---|
| Distinct (%) | 2.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 192686.9764 |
| Minimum | 200 |
| Maximum | 5472000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.9 MiB |
Quantile statistics
| Minimum | 200 |
|---|---|
| 5-th percentile | 10000 |
| Q1 | 35000 |
| median | 90000 |
| Q3 | 225000 |
| 95-th percentile | 750000 |
| Maximum | 5472000 |
| Range | 5471800 |
| Interquartile range (IQR) | 190000 |
Descriptive statistics
| Standard deviation | 283263.3913 |
|---|---|
| Coefficient of variation (CV) | 1.470070249 |
| Kurtosis | 21.01888249 |
| Mean | 192686.9764 |
| Median Absolute Deviation (MAD) | 65000 |
| Skewness | 3.520790055 |
| Sum | 1.732571924 × 10¹¹ |
| Variance | 8.023814885 × 10¹⁰ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 50000 | 69394 | 7.7% |
| 25000 | 51258 | 5.7% |
| 100000 | 50977 | 5.7% |
| 10000 | 38366 | 4.3% |
| 150000 | 27624 | 3.1% |
| 20000 | 23434 | 2.6% |
| 35000 | 23181 | 2.6% |
| 30000 | 21004 | 2.3% |
| 5000 | 19146 | 2.1% |
| 15000 | 18472 | 2.1% |
| Other values (22118) | 556308 | 61.9% |
| Value | Count | Frequency (%) |
| 200 | 2 | < 0.1% |
| 300 | 1 | < 0.1% |
| 400 | 2 | < 0.1% |
| 500 | 33 | < 0.1% |
| 700 | 4 | < 0.1% |
| 800 | 4 | < 0.1% |
| 950 | 1 | < 0.1% |
| 1000 | 444 | < 0.1% |
| 1200 | 12 | < 0.1% |
| 1300 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 5472000 | 1 | < 0.1% |
| 5000000 | 40 | < 0.1% |
| 4991700 | 1 | < 0.1% |
| 4950000 | 1 | < 0.1% |
| 4908500 | 1 | < 0.1% |
| 4900000 | 2 | < 0.1% |
| 4872000 | 1 | < 0.1% |
| 4869000 | 1 | < 0.1% |
| 4830000 | 1 | < 0.1% |
| 4800000 | 1 | < 0.1% |
SBA_Appv
| Distinct | 38326 |
|---|---|
| Distinct (%) | 4.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 149488.7882 |
| Minimum | 100 |
| Maximum | 5472000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.9 MiB |
Quantile statistics
| Minimum | 100 |
|---|---|
| 5-th percentile | 5000 |
| Q1 | 21250 |
| median | 61250 |
| Q3 | 175000 |
| 95-th percentile | 626250 |
| Maximum | 5472000 |
| Range | 5471900 |
| Interquartile range (IQR) | 153750 |
Descriptive statistics
| Standard deviation | 228414.5615 |
|---|---|
| Coefficient of variation (CV) | 1.52797119 |
| Kurtosis | 25.32551382 |
| Mean | 149488.7882 |
| Median Absolute Deviation (MAD) | 48750 |
| Skewness | 3.675275286 |
| Sum | 1.344149367 × 10¹¹ |
| Variance | 5.217321191 × 10¹⁰ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 25000 | 49579 | 5.5% |
| 12500 | 40147 | 4.5% |
| 5000 | 31135 | 3.5% |
| 50000 | 25047 | 2.8% |
| 10000 | 17009 | 1.9% |
| 17500 | 16141 | 1.8% |
| 15000 | 14490 | 1.6% |
| 7500 | 12781 | 1.4% |
| 127500 | 11946 | 1.3% |
| 80000 | 10965 | 1.2% |
| Other values (38316) | 669924 | 74.5% |
| Value | Count | Frequency (%) |
| 100 | 2 | < 0.1% |
| 150 | 1 | < 0.1% |
| 200 | 2 | < 0.1% |
| 250 | 33 | < 0.1% |
| 350 | 4 | < 0.1% |
| 400 | 4 | < 0.1% |
| 475 | 1 | < 0.1% |
| 500 | 442 | < 0.1% |
| 600 | 12 | < 0.1% |
| 650 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 5472000 | 1 | < 0.1% |
| 5000000 | 1 | < 0.1% |
| 4869000 | 1 | < 0.1% |
| 4582000 | 1 | < 0.1% |
| 4500000 | 23 | < 0.1% |
| 4492530 | 1 | < 0.1% |
| 4410000 | 1 | < 0.1% |
| 4320000 | 1 | < 0.1% |
| 4050000 | 4 | < 0.1% |
| 4000000 | 13 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.
To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
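The calculation described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the code the report generator runs: the helper name `pearson_r` is hypothetical, and the sample values are copied from the GrAppv and SBA_Appv columns of the first five rows shown in this report.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample covariance and sample standard deviations (the n - 1 factors cancel in r).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (std_x * std_y)

# GrAppv and SBA_Appv from the first five rows of the dataset.
gr_appv = [60000.0, 40000.0, 287000.0, 35000.0, 229000.0]
sba_appv = [48000.0, 32000.0, 215250.0, 28000.0, 229000.0]

r = pearson_r(gr_appv, sba_appv)

# Shifting and positively rescaling one variable leaves r unchanged.
r_shifted = pearson_r([2.0 * v + 1000.0 for v in gr_appv], sba_appv)

print(r, r_shifted)
```

Five rows are far too few to say anything about the full dataset; the sample only demonstrates the formula and the invariance property.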
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.
To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
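As a rough sketch of that definition, ρ can be computed by ranking both variables and applying the Pearson formula to the ranks. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) and the toy data are assumptions made for illustration; ties are given the average of their positions, which is one common convention.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of standard deviations (common factors cancel)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A monotonic but nonlinear relationship: rho is 1 while r falls short of 1.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y), pearson_r(x, y))
```

The toy data shows why ρ is better suited to nonlinear monotonic relationships: cubing preserves the ordering, so the ranks agree perfectly even though the raw values do not lie on a line.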
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available in the phik package documentation.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proven to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | LoanNr_ChkDgt | Name | City | State | Zip | Bank | BankState | NAICS | ApprovalDate | ApprovalFY | Term | NoEmp | NewExist | CreateJob | RetainedJob | FranchiseCode | UrbanRural | RevLineCr | LowDoc | ChgOffDate | DisbursementDate | DisbursementGross | BalanceGross | MIS_Status | ChgOffPrinGr | GrAppv | SBA_Appv |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1000014003 | ABC HOBBYCRAFT | EVANSVILLE | IN | 47711 | FIFTH THIRD BANK | OH | 451120 | 28-Feb-97 | 1997 | 84 | 4 | 2.0 | 0 | 0 | 1 | 0 | N | Y | NaN | 1999-02-28 | 60000.0 | $0.00 | P I F | 0.0 | 60000.0 | 48000.0 |
| 1 | 1000024006 | LANDMARK BAR & GRILLE (THE) | NEW PARIS | IN | 46526 | 1ST SOURCE BANK | IN | 722410 | 28-Feb-97 | 1997 | 60 | 2 | 2.0 | 0 | 0 | 1 | 0 | N | Y | NaN | 1997-05-31 | 40000.0 | $0.00 | P I F | 0.0 | 40000.0 | 32000.0 |
| 2 | 1000034009 | WHITLOCK DDS, TODD M. | BLOOMINGTON | IN | 47401 | GRANT COUNTY STATE BANK | IN | 621210 | 28-Feb-97 | 1997 | 180 | 7 | 1.0 | 0 | 0 | 1 | 0 | N | N | NaN | 1997-12-31 | 287000.0 | $0.00 | P I F | 0.0 | 287000.0 | 215250.0 |
| 3 | 1000044001 | BIG BUCKS PAWN & JEWELRY, LLC | BROKEN ARROW | OK | 74012 | 1ST NATL BK & TR CO OF BROKEN | OK | 0 | 28-Feb-97 | 1997 | 60 | 2 | 1.0 | 0 | 0 | 1 | 0 | N | Y | NaN | 1997-06-30 | 35000.0 | $0.00 | P I F | 0.0 | 35000.0 | 28000.0 |
| 4 | 1000054004 | ANASTASIA CONFECTIONS, INC. | ORLANDO | FL | 32801 | FLORIDA BUS. DEVEL CORP | FL | 0 | 28-Feb-97 | 1997 | 240 | 14 | 1.0 | 7 | 7 | 1 | 0 | N | N | NaN | 1997-05-14 | 229000.0 | $0.00 | P I F | 0.0 | 229000.0 | 229000.0 |
| 5 | 1000084002 | B&T SCREW MACHINE COMPANY, INC | PLAINVILLE | CT | 6062 | TD BANK, NATIONAL ASSOCIATION | DE | 332721 | 28-Feb-97 | 1997 | 120 | 19 | 1.0 | 0 | 0 | 1 | 0 | N | N | NaN | 1997-06-30 | 517000.0 | $0.00 | P I F | 0.0 | 517000.0 | 387750.0 |
| 6 | 1000093009 | MIDDLE ATLANTIC SPORTS CO INC | UNION | NJ | 7083 | WELLS FARGO BANK NATL ASSOC | SD | 0 | 2-Jun-80 | 1980 | 45 | 45 | 2.0 | 0 | 0 | 0 | 0 | N | N | 24-Jun-91 | 1980-07-22 | 600000.0 | $0.00 | CHGOFF | 208959.0 | 600000.0 | 499998.0 |
| 7 | 1000094005 | WEAVER PRODUCTS | SUMMERFIELD | FL | 34491 | REGIONS BANK | AL | 811118 | 28-Feb-97 | 1997 | 84 | 1 | 2.0 | 0 | 0 | 1 | 0 | N | Y | NaN | 1998-06-30 | 45000.0 | $0.00 | P I F | 0.0 | 45000.0 | 36000.0 |
| 8 | 1000104006 | TURTLE BEACH INN | PORT SAINT JOE | FL | 32456 | CENTENNIAL BANK | FL | 721310 | 28-Feb-97 | 1997 | 297 | 2 | 2.0 | 0 | 0 | 1 | 0 | N | N | NaN | 1997-07-31 | 305000.0 | $0.00 | P I F | 0.0 | 305000.0 | 228750.0 |
| 9 | 1000124001 | INTEXT BUILDING SYS LLC | GLASTONBURY | CT | 6073 | WEBSTER BANK NATL ASSOC | CT | 0 | 28-Feb-97 | 1997 | 84 | 3 | 2.0 | 0 | 0 | 1 | 0 | N | Y | NaN | 1997-04-30 | 70000.0 | $0.00 | P I F | 0.0 | 70000.0 | 56000.0 |
Last rows
| | LoanNr_ChkDgt | Name | City | State | Zip | Bank | BankState | NAICS | ApprovalDate | ApprovalFY | Term | NoEmp | NewExist | CreateJob | RetainedJob | FranchiseCode | UrbanRural | RevLineCr | LowDoc | ChgOffDate | DisbursementDate | DisbursementGross | BalanceGross | MIS_Status | ChgOffPrinGr | GrAppv | SBA_Appv |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 899154 | 9995423005 | LITWIN LIVERY SERVICES, INC. | CAMPBELL | OH | 44405 | JPMORGAN CHASE BANK NATL ASSOC | IL | 0 | 27-Feb-97 | 1997 | 60 | 1 | 1.0 | 0 | 0 | 1 | 0 | 0 | N | NaN | 1997-09-30 | 10000.0 | $0.00 | P I F | 0.0 | 10000.0 | 5000.0 |
| 899155 | 9995453003 | FUTURE LEADERS CENTER, INC. | SO. OZONE PARK | NY | 11420 | FLUSHING BANK | NY | 624410 | 27-Feb-97 | 1997 | 180 | 2 | 1.0 | 0 | 0 | 1 | 0 | 0 | N | NaN | 1997-06-30 | 123000.0 | $0.00 | P I F | 0.0 | 128000.0 | 96000.0 |
| 899156 | 9995473009 | FABRICATORS STEEL, INC. | BALTIMORE | MD | 21224 | BANK OF AMERICA NATL ASSOC | MD | 332431 | 27-Feb-97 | 1997 | 60 | 20 | 1.0 | 0 | 0 | 1 | 0 | 0 | N | NaN | 1997-06-30 | 50000.0 | $0.00 | P I F | 0.0 | 50000.0 | 25000.0 |
| 899157 | 9995493004 | PULLTARPS MFG. | EL CAJON | CA | 92020 | U.S. BANK NATIONAL ASSOCIATION | CA | 314912 | 27-Feb-97 | 1997 | 36 | 40 | 1.0 | 0 | 0 | 1 | 0 | N | N | NaN | 1997-03-31 | 200000.0 | $0.00 | P I F | 0.0 | 200000.0 | 150000.0 |
| 899158 | 9995563001 | SHADES WINDOW TINTING AUTO ALA | IRVING | TX | 75062 | LOANS FROM OLD CLOSED LENDERS | DC | 0 | 27-Feb-97 | 1997 | 84 | 5 | 2.0 | 0 | 0 | 1 | 0 | N | Y | NaN | 1997-06-30 | 79000.0 | $0.00 | P I F | 0.0 | 79000.0 | 63200.0 |
| 899159 | 9995573004 | FABRIC FARMS | UPPER ARLINGTON | OH | 43221 | JPMORGAN CHASE BANK NATL ASSOC | IL | 451120 | 27-Feb-97 | 1997 | 60 | 6 | 1.0 | 0 | 0 | 1 | 0 | 0 | N | NaN | 1997-09-30 | 70000.0 | $0.00 | P I F | 0.0 | 70000.0 | 56000.0 |
| 899160 | 9995603000 | FABRIC FARMS | COLUMBUS | OH | 43221 | JPMORGAN CHASE BANK NATL ASSOC | IL | 451130 | 27-Feb-97 | 1997 | 60 | 6 | 1.0 | 0 | 0 | 1 | 0 | Y | N | NaN | 1997-10-31 | 85000.0 | $0.00 | P I F | 0.0 | 85000.0 | 42500.0 |
| 899161 | 9995613003 | RADCO MANUFACTURING CO.,INC. | SANTA MARIA | CA | 93455 | RABOBANK, NATIONAL ASSOCIATION | CA | 332321 | 27-Feb-97 | 1997 | 108 | 26 | 1.0 | 0 | 0 | 1 | 0 | N | N | NaN | 1997-09-30 | 300000.0 | $0.00 | P I F | 0.0 | 300000.0 | 225000.0 |
| 899162 | 9995973006 | MARUTAMA HAWAII, INC. | HONOLULU | HI | 96830 | BANK OF HAWAII | HI | 0 | 27-Feb-97 | 1997 | 60 | 6 | 1.0 | 0 | 0 | 1 | 0 | N | Y | 8-Mar-00 | 1997-03-31 | 75000.0 | $0.00 | CHGOFF | 46383.0 | 75000.0 | 60000.0 |
| 899163 | 9996003010 | PACIFIC TRADEWINDS FAN & LIGHT | KAILUA | HI | 96734 | CENTRAL PACIFIC BANK | HI | 0 | 27-Feb-97 | 1997 | 48 | 1 | 2.0 | 0 | 0 | 1 | 0 | N | N | NaN | 1997-05-31 | 30000.0 | $0.00 | P I F | 0.0 | 30000.0 | 24000.0 |